NodeWiz: Fault-tolerant grid information service
Authors
S. Basu, S. Banerjee, P. Sharma, S.-J. Lee (Hewlett-Packard Laboratories, Palo Alto, CA 94304, USA)
L. B. Costa, F. Brasileiro (Universidade Federal de Campina Grande, 58.109-970, Campina Grande, Paraíba, Brazil)
Abstract
Large-scale grid computing systems may provide multitudinous services, from different providers, whose quality of service will vary. Moreover, services are deployed and undeployed in the grid with no central coordination. Thus, to find the most suitable service to fulfill their needs, or to find the most suitable set of resources on which to deploy their services, grid users must resort to a Grid Information Service (GIS). This service allows users to submit rich queries that are normally composed of multiple attributes and range operations. The ability to efficiently execute complex searches in a scalable and reliable way is a key challenge for current GIS designs. Scalability issues are normally dealt with by using peer-to-peer technologies. However, the more reliable peer-to-peer approaches do not cater for rich queries in a natural way. On the other hand, approaches that can easily support these rich queries are less robust in the presence of failures. In this paper we present the design of NodeWiz, a GIS that allows multi-attribute range queries to be performed efficiently in a distributed manner, while maintaining load balance and resilience to failures.
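To make the notion of a multi-attribute range query concrete, the following is a minimal sketch of how such a query could be matched against resource advertisements. It is a centralized linear scan for illustration only, not the distributed NodeWiz design described in the paper. The attribute names (`cpu_ghz`, `memory_gb`, `load`) and the helper functions `matches` and `search` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical data model: an advertisement maps attribute names to numeric
# values; a query maps attribute names to inclusive (low, high) ranges.
Advertisement = Dict[str, float]
RangeQuery = Dict[str, Tuple[float, float]]


def matches(ad: Advertisement, query: RangeQuery) -> bool:
    """Return True if the advertisement satisfies every range in the query."""
    for attr, (low, high) in query.items():
        value = ad.get(attr)
        # An advertisement lacking a queried attribute cannot match.
        if value is None or not (low <= value <= high):
            return False
    return True


def search(ads: List[Advertisement], query: RangeQuery) -> List[Advertisement]:
    """Linear scan over all advertisements (illustrative, not scalable)."""
    return [ad for ad in ads if matches(ad, query)]


ads = [
    {"cpu_ghz": 2.4, "memory_gb": 8, "load": 0.30},
    {"cpu_ghz": 3.2, "memory_gb": 16, "load": 0.75},
    {"cpu_ghz": 3.0, "memory_gb": 32, "load": 0.10},
]

# Three attributes, each constrained to a range.
query = {"cpu_ghz": (2.8, 4.0), "memory_gb": (16, 64), "load": (0.0, 0.5)}
results = search(ads, query)
```

Here only the third advertisement satisfies all three ranges. A scalable GIS has to answer the same kind of query without scanning every advertisement at a single node, which is the problem the abstract describes.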
Related resources
Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid
Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing that periodically ...
Building Fault-Tolerant Consistency Protocols for an Adaptive Grid Data-Sharing Service
We address the challenge of sharing large amounts of numerical data within computing grids consisting of clusters federation. We focus on the problem of handling the consistency of replicated data in an environment where the availability of storage resources dynamically changes. We propose a software architecture which decouples consistency management from fault-tolerance management. We illustr...
Architectural Plan for Constructing Fault Tolerable Workflow Engines Based on Grid Service
In this paper the design and implementation of fault tolerable architecture for scientific workflow engines is presented. The engines are assumed to be implemented as composite web services. Current architectures for workflow engines do not make any considerations for substituting faulty web services with correct ones at run time. The difficulty is to rollback the execution state of the workflo...
Towards a Robust and Fault-Tolerant Multicast Discovery Architecture for Global Computing Grids
Global grid systems with potentially millions of services require a very effective and efficient service discovery/location mechanism. Current grid environments, due to their smaller size, rely mainly on centralised service directories. Large-scale systems need a decentralised service discovery system that operates reliably in a dynamic and error-prone environment. Work has been done in studyin...
A Scalable Byzantine Fault Tolerant Service in Grid System
This paper describes the design, implementation and usage of a secure, scalable Byzantine fault tolerant MDS system in the Grid. The scalable Byzantine fault tolerant MDS system provides a hierarchy of GIIS servers; a local GIIS domain can request the resources it needs from a remote GIIS domain. By using the state machine replication approach and quorum system technique, the scalable Byzantine fault t...
Journal: Peer-to-Peer Networking and Applications
Volume: 2
Publication date: 2009